Improving Distributed Representation of Word Sense via WordNet Gloss Composition and Context Clustering
Abstract
In recent years, there has been increasing interest in learning distributed representations of word senses. Traditional context-clustering-based models usually require careful tuning of model parameters and typically perform worse on infrequent word senses. This paper presents a novel approach that addresses these limitations by first initializing the word sense embeddings through learning sentence-level embeddings from WordNet glosses using a convolutional neural network. The initialized word sense embeddings are then used by a context-clustering-based model to generate the distributed representations of word senses. Our learned representations outperform the publicly available embeddings on 2 out of 4 metrics in the word similarity task, and on 6 out of 13 subtasks in the analogical reasoning task.
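The two-stage idea described in the abstract (gloss-based initialization followed by context clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the convolutional gloss composer is replaced here by plain averaging of word vectors, and the toy vectors, the glosses, the sense labels, and the function names `compose_gloss` and `assign_and_update` are all assumptions made for illustration.

```python
import math

# Toy 2-d word vectors (hypothetical values, for illustration only).
WORD_VECS = {
    "money": [1.0, 0.0],
    "deposit": [0.9, 0.1],
    "river": [0.0, 1.0],
    "water": [0.1, 0.9],
}

def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def compose_gloss(gloss):
    """Stand-in for the CNN composer: average the known words of a gloss."""
    return mean([WORD_VECS[w] for w in gloss.split() if w in WORD_VECS])

def assign_and_update(sense_vecs, context, lr=0.1):
    """One context-clustering step: pick the sense closest to the context,
    then move that sense vector a little toward the context vector.
    Assumes the context contains at least one known word."""
    ctx = mean([WORD_VECS[w] for w in context if w in WORD_VECS])
    best = max(sense_vecs, key=lambda s: cosine(sense_vecs[s], ctx))
    sense_vecs[best] = [(1 - lr) * s + lr * c
                        for s, c in zip(sense_vecs[best], ctx)]
    return best

# Hypothetical sense labels and shortened glosses for the word "bank".
GLOSSES = {
    "bank.n.1": "institution that accepts money deposit",
    "bank.n.2": "sloping land beside river water",
}

# Stage 1: initialize every sense vector from its gloss.
sense_vecs = {sense: compose_gloss(gloss) for sense, gloss in GLOSSES.items()}

# Stage 2: cluster an observed context onto the nearest sense.
chosen = assign_and_update(sense_vecs, ["river", "water"])
print(chosen)  # bank.n.2
```

Because each sense starts from its gloss rather than from a random vector, even a sense that rarely appears in the corpus has a meaningful starting point, which is the motivation the abstract gives for the initialization step.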
Similar resources
A Gloss Composition and Context Clustering Based Distributed Word Sense Representation Model
In recent years, there has been an increasing interest in learning a distributed representation of word sense. Traditional context clustering based models usually require careful tuning of model parameters, and typically perform worse on infrequent word senses. This paper presents a novel approach which addresses these limitations by first initializing the word sense embeddings through learning...
Improving Word Sense Discrimination with Gloss Augmented Feature Vectors
This paper presents a method of unsupervised word sense discrimination that augments co-occurrence feature vectors derived from raw untagged corpora with information from the glosses found in a machine readable dictionary. Each content word that occurs in the context of a target word to be discriminated is represented by a co-occurrence feature vector. Each of these vectors is augmented with th...
Extending and Improving Wordnet via Unsupervised Word Embeddings
This work presents an unsupervised approach for improving WordNet that builds upon recent advances in document and sense representation via distributional semantics. We apply our methods to construct Wordnets in French and Russian, languages which both lack good manual constructions. These are evaluated on two new 600-word test sets for word-to-synset matching and found to improve greatly upon...
Clustering WordNet Senses Utilizing Modified and Novel Similarity Metrics CS 229 Final Project Report
Introduction. We approach the problem of clustering senses in Princeton's WordNet (Fellbaum 1998), a manually created dictionary/thesaurus which attempts to model the structure underlying human concepts. A synset, the fundamental unit in WordNet, is represented by a group of synonyms and a gloss definition, and is connected through a variety of semantic links, such as hypernyms (type-of) or mero...
A gloss-centered algorithm for disambiguation
The task of word sense disambiguation is to assign a sense label to a word in a passage. We report our algorithms and experiments for the two tasks that we participated in viz. the task of WSD of WordNet glosses and the task of WSD of English lexical sample. For both the tasks, we explore a method of sense disambiguation through a process of “comparing” the current context for a word against a ...
Publication date: 2015